Issue Info: 
  • Year: 

    2022
  • Volume: 

    52
  • Issue: 

    4
  • Pages: 

    281-291
Measures: 
  • Citations: 

    0
  • Views: 

    154
  • Downloads: 

    18
Abstract: 

Automatic topic detection is practically unavoidable in social media analysis because of the large volume of text that users generate. Clustering-based methods are among the most important and current approaches to topic detection, and this research surveys that category broadly. The paper therefore studies the three main components of clustering-based topic detection: embedding methods, distance metrics, and clustering algorithms. Transfer learning, and with it pretrained language models and word embeddings, has received considerable attention in recent years. Given the importance of embedding methods, the paper compares the efficiency of five of them, ranging from earlier to recent ones. For the study, the authors implemented two commonly used distance metrics and five clustering algorithms that are important in topic detection. Because COVID-19 has been a trending topic on social networks in recent years, the study uses a dataset of one month of tweets collected with COVID-19-related hashtags. More than 7500 experiments were performed to set the tunable parameters. All 50 combinations of embedding method, distance metric, and clustering algorithm were then evaluated with the Silhouette metric. The results show that T5 strongly outperforms the other embedding methods, cosine distance is slightly better than the other distance metric, and DBSCAN is superior to the other clustering algorithms.
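The evaluation step described in this abstract, scoring a clustering with the Silhouette metric under different distance metrics, can be sketched as follows. This is a minimal standard-library illustration, not the authors' implementation: the toy vectors, the label lists, and the function names `euclidean`, `cosine_distance`, and `silhouette` are assumptions made for the example, and real embeddings would have hundreds of dimensions.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    # Assumes neither vector is all zeros.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def silhouette(points, labels, dist):
    """Mean silhouette coefficient; requires at least two clusters."""
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        a = sum(dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

# Hypothetical 2-D "embeddings" forming two obvious groups.
points = [(1.0, 0.1), (0.9, 0.2), (0.1, 1.0), (0.2, 0.9)]
good_labels = [0, 0, 1, 1]  # matches the two groups
bad_labels = [0, 1, 0, 1]   # splits each group

for name, dist in [("euclidean", euclidean), ("cosine", cosine_distance)]:
    good = silhouette(points, good_labels, dist)
    bad = silhouette(points, bad_labels, dist)
    print(name, round(good, 3), round(bad, 3))
```

A score near 1 indicates compact, well-separated clusters and a negative score indicates points assigned to the wrong cluster, which is why the metric can rank every combination of embedding, distance, and algorithm on one scale.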

Yearly Impact: Scientific Information Database (SID) - Trusted Source for Research and Academic Resources

Issue Info: 
  • Year: 

    2018
  • Volume: 

    4
Measures: 
  • Views: 

    153
  • Downloads: 

    0
Abstract: 

With the rapid development of the World Wide Web and the growing volume of information, web research has become an important research area. Web mining research falls mainly into two types: web content mining and web usage mining. An important topic in web usage mining is the clustering of users, in other words, grouping users into clusters based on their common features. In this paper, we clustered users into groups with similar characteristics using the k-means, Kohonen, and two-step methods, applied principal component analysis to enhance clustering quality, and used the silhouette criterion to assess it. Among the three methods, k-means with two clusters gave the highest quality, so the data set was clustered and analyzed with this method. Analyzing the clusters gives a better understanding of the users and makes it possible to provide customized and more convenient services for them.

Author(s): 

MURTAGH F. | CONTRERAS P.

Journal: 

VIRTUAL

Issue Info: 
  • Year: 

    621
  • Volume: 

    1
  • Issue: 

    1
  • Pages: 

    86-97
Measures: 
  • Citations: 

    1
  • Views: 

    144
  • Downloads: 

    0
Keywords: 
Abstract: 

Issue Info: 
  • Year: 

    2009
  • Volume: 

    -
  • Issue: 

    9
  • Pages: 

    20-25
Measures: 
  • Citations: 

    1
  • Views: 

    139
  • Downloads: 

    0
Keywords: 
Abstract: 

Author(s): 

SHIEH H.M. | MAY M.D.

Issue Info: 
  • Year: 

    2001
  • Volume: 

    18
  • Issue: 

    3
  • Pages: 

    1-12
Measures: 
  • Citations: 

    1
  • Views: 

    184
  • Downloads: 

    0
Keywords: 
Abstract: 

Issue Info: 
  • Year: 

    2022
  • Volume: 

    19
  • Issue: 

    4
  • Pages: 

    95-120
Measures: 
  • Citations: 

    0
  • Views: 

    72
  • Downloads: 

    8
Abstract: 

Data clustering is one of the main steps in data mining and is responsible for exploring hidden patterns in unlabeled data. Because of the complexity of the problem and the weakness of basic clustering methods, most studies today turn to clustering ensemble methods. Diversity among the primary results is one of the most important factors affecting the quality of the final result, and the quality of the primary results is another. Both factors have been considered in recent research on ensemble clustering. Here, a new framework for improving clustering efficiency is proposed that addresses these issues by using a subset of the primary clusters. The selection of this subset plays a vital role in the efficiency of the ensemble. Since evolutionary intelligent algorithms have been able to solve the majority of complex engineering problems, this paper uses them to select the subset of primary clusters, with three methods: genetic algorithm, simulated annealing, and particle swarm optimization. The main idea is to use the more stable clusters in the ensemble: stability serves as a goodness measure, and the clusters that satisfy a threshold on this measure are selected to participate. To combine the chosen clusters, a co-association-based consensus function is applied. A new EAC-based method, Extended Evidence Accumulation Clustering (EEAC), is proposed for constructing the co-association matrix from the subset of clusters.
Experimental results on several standard datasets, evaluated with normalized mutual information, Fisher, and accuracy criteria, show that the proposed method improves significantly on the Alizadeh, Azimi, Berikov, CLWGC, RCESCC, KME, CFSFDP, DBSCAN, NSC, and Chen methods.
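The co-association idea behind evidence accumulation can be sketched as below. This shows only the standard construction (the fraction of base partitions in which two samples share a cluster) followed by a simple threshold-and-connect consensus step; the abstract does not give the details of EEAC or of the stability-based cluster selection, so none of that is reproduced here. The function names, the toy partitions, and the 0.5 threshold are assumptions for illustration.

```python
def co_association(partitions):
    """Entry [i][j] is the fraction of partitions that put i and j together."""
    n = len(partitions[0])
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    m = len(partitions)
    return [[c / m for c in row] for row in counts]

def consensus(matrix, threshold=0.5):
    """Link samples whose co-association exceeds the threshold and
    return the connected components as consensus cluster labels."""
    n = len(matrix)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and matrix[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels

# Three hypothetical base clusterings of six samples. Label values are
# arbitrary per partition; only co-membership matters.
partitions = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 2, 2],
    [1, 1, 1, 0, 0, 0],
]

matrix = co_association(partitions)
print(consensus(matrix))
```

Samples 2 and 3 share a cluster in only one of the three partitions, so their co-association of 1/3 falls below the threshold and the consensus keeps them apart.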

Author(s): 

KUMAR VINAY

Issue Info: 
  • Year: 

    2011
  • Volume: 

    8
  • Issue: 

    -
  • Pages: 

    0-0
Measures: 
  • Citations: 

    1
  • Views: 

    126
  • Downloads: 

    0
Keywords: 
Abstract: 

Issue Info: 
  • Year: 

    2021
  • Volume: 

    20
  • Issue: 

    2
  • Pages: 

    29-42
Measures: 
  • Citations: 

    0
  • Views: 

    26
  • Downloads: 

    3
Abstract: 

Recently, several statistical studies have worked with shape data. One such problem is clustering shape data, the main topic of this paper. We study several clustering algorithms on shape data and identify the best one based on accuracy, speed, and scalability. In addition, we propose a method for representing shape data that simplifies and speeds up shape clustering algorithms. Although this method is not very accurate, it is fast, so it is useful for datasets with many landmarks or observations, which other algorithms take a long time to cluster. The method itself is not new; the contribution of this article is applying it to shape data analysis.

Issue Info: 
  • Year: 

    2023
  • Volume: 

    1
  • Issue: 

    4
  • Pages: 

    6-25
Measures: 
  • Citations: 

    0
  • Views: 

    72
  • Downloads: 

    14
Abstract: 

Clustering plays an important role in most research fields, such as engineering, medicine, biology, and data mining. Clustering is unsupervised partitioning: the data are divided into groups whose members are more similar to one another in terms of the parameters of interest. One of the best-known methods in this field is k-means, which groups N data points into k clusters quickly but depends on initial conditions and converges to local optima. To address these problems, this article uses a combined method based on evolutionary algorithms, chaos theory, and k-means that, in addition to solving the problems mentioned, is independent of the number of variables. For validation, the proposed methods are applied to 13 well-known datasets, and the results are compared with the genetic algorithm, particle swarm optimization, bee colony, simulated annealing, differential evolution, harmony search, and k-means methods. The results demonstrate the high capability and robustness of the proposed methods.
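The dependence of k-means on initial conditions that motivates this work can be seen in a small deterministic example. The sketch below runs plain Lloyd iterations on one-dimensional data from two different sets of starting centroids; it does not reproduce the paper's evolutionary or chaos-based method, and the data, the starting centroids, and the name `kmeans_1d` are assumptions for illustration.

```python
def kmeans_1d(data, centroids, max_iter=100):
    """Lloyd's algorithm on 1-D data from explicit starting centroids.
    Returns the final centroids and the sum of squared errors (SSE)."""
    centroids = list(centroids)
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        groups = [[] for _ in centroids]
        for x in data:
            nearest = min(range(len(centroids)),
                          key=lambda c: abs(x - centroids[c]))
            groups[nearest].append(x)
        # Update step: each centroid moves to the mean of its group
        # (an empty group keeps its previous centroid).
        updated = [sum(g) / len(g) if g else centroids[i]
                   for i, g in enumerate(groups)]
        if updated == centroids:
            break
        centroids = updated
    sse = sum(min((x - c) ** 2 for c in centroids) for x in data)
    return centroids, sse

# Three well-separated groups of three points each.
data = [1.0, 2.0, 3.0, 11.0, 12.0, 13.0, 21.0, 22.0, 23.0]

good_centroids, good_sse = kmeans_1d(data, [2.0, 12.0, 22.0])
bad_centroids, bad_sse = kmeans_1d(data, [1.0, 2.0, 3.0])

print(good_centroids, good_sse)
print(bad_centroids, bad_sse)
```

Starting with all three centroids inside the first group, the algorithm converges with two centroids stuck there and one centroid covering the other two groups, a local optimum with a much larger SSE. Metaheuristic hybrids of the kind the abstract describes search over such starting points to avoid this outcome.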

Issue Info: 
  • Year: 

    2022
  • Volume: 

    52
  • Issue: 

    3
  • Pages: 

    205-215
Measures: 
  • Citations: 

    0
  • Views: 

    136
  • Downloads: 

    23
Abstract: 

Distance-based clustering methods categorize samples by optimizing a global criterion, finding ellipsoid clusters of roughly equal size. In contrast, density-based clustering techniques form clusters with arbitrary shapes and sizes by optimizing a local criterion. Most of these methods have several hyper-parameters, and their performance depends heavily on the hyper-parameter setup. Recently, a Gaussian Density Distance (GDD) approach was proposed to optimize local criteria in terms of the distance and density properties of samples. GDD can find clusters with different shapes and sizes without any free parameters. However, it may fail to discover the appropriate clusters because already clustered samples interfere with estimating the density and distance properties of the remaining unclustered samples. Here, we introduce Adaptive GDD (AGDD), which eliminates this unwanted effect of clustered samples by adaptively updating the parameters during clustering. It is stable and can identify clusters with various shapes, sizes, and densities without adding extra parameters. Because the distance metric used to measure dissimilarity between samples can affect clustering performance, the effect of different distance metrics on the method is also analyzed. Experiments on several well-known datasets show the effectiveness of the proposed AGDD method compared with other well-known clustering methods.
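The hyper-parameter dependence of density-based methods noted in this abstract can be illustrated with DBSCAN, used here as a stand-in because the abstract does not specify the GDD or AGDD procedures. The sketch below is a minimal standard-library DBSCAN; the toy points and the `eps` and `min_pts` values are assumptions chosen for the example.

```python
import math

NOISE = -1

def region_query(points, i, eps):
    """Indices of all points within eps of point i (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    labels = [None] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # core point: keep growing
        cluster += 1
    return labels

# An elongated chain, a compact blob, and one isolated outlier.
points = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0),
          (10, 10), (10, 11), (11, 10),
          (50, 50)]

print(dbscan(points, eps=1.5, min_pts=2))
print(dbscan(points, eps=0.5, min_pts=2))
```

With `eps=1.5` the chain and the blob are each recovered as one cluster and the outlier is marked as noise, while with `eps=0.5` no point has a neighbor and everything becomes noise. This sensitivity is the kind of setup dependence that parameter-free approaches aim to remove.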
